Arabic Information Retrieval: A Relevancy Assessment Survey

نویسندگان

  • Ahmad Hussein Ababneh
  • Joan Lu
  • Qiang Xu
چکیده

The paper presents a research in Arabic Information Retrieval (IR). It surveys the impact of statistical and morphological analysis of Arabic text in improving Arabic IR relevancy. We investigated the contributions of Stemming, Indexing, Query Expansion, Text Summarization (TS), Text Translation, and Named Entity Recognition (NER) in enhancing the relevancy of Arabic IR. Our survey emphasizing on the quantitative relevancy measurements provided in the surveyed publications. The paper shows that the researchers achieved significant enhancements especially in building accurate stemmers, with accuracy reaches 97%, and in measuring the impact of different indexing strategies. Query expansion and Text Translation showed positive relevancy effect. However, other tasks such as NER and TS still need more research to realize their impact on Arabic IR.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The socio - cognitive theory in information retrieval (IR)

Abstract Background and Aim: The socio-cognitive theory introduced in information science by Horland and Alberchtsen. The socio-cognitive view turns the traditional cognitive program upside down. The socio-cognitive theory emphasizes on different cultural and social structures of users. Hence, the aim of the article is to explain the role of socio - cognitive theory in information retrieval (I...

متن کامل

بررسی تأثیرات ریشه‌یابی در بازیابی اطلاعات در زبان فارسی

Using the language-specific behavior in information retrieval systems can improve the quality of the retrieved results significantly. Part of the word that remains after removing its affixes is called stem. Stemming process can be used for improving the relevancy of the results in information retrieval system. Different morphological variants of words (plural, past tense…) will be mapped into t...

متن کامل

Relevancy in schema agnostic environment

Relevance is an important component in full-text search and often distinguishes the implementations. Relevancy is used to score matching documents and rank them according to the users intent. One of the reasons of the high popularity of Google is its good relevancy originally based on the PageRank algorithm. The emergence of semi-structured data as a standard for data representation opened up n...

متن کامل

Mean Field Approach to a Probabilistic Model in Information Retrieval

We study an explicit parametric model of documents, queries, and relevancy assessment for Information Retrieval (IR). Mean-field methods are applied to analyze the model and derive efficient practical algorithms to estimate the parameters in the problem. The hyperparameters are estimated by a fast approximate leave-one-out cross-validation procedure based on the cavity method. The algorithm is ...

متن کامل

Automatic Evaluation of Search Engines with Social Relevancy Rank Factoring

In the past precision of information retrieval systems has been evaluated based on general subject queries and some specific query domains both manually and automatically. Manually evaluating the effectiveness of information retrieval systems, in terms of relevance, requires a large amount of human effort and time. Automatic evaluation is much better in adapting to the fast changing web and sea...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016